fix(observer): ocr_enabled flag — bypass screencapture popup storm#10
Merged
Conversation
User report (2026-05-02): after merging Step 5 and starting codec-observer,
the screencapture-based OCR fired the macOS Screen Recording permission
popup repeatedly — "over a hundred times" per user. Each click of "Allow"
on the popup grants only one-time access; the persistent grant is the
toggle in System Settings → Privacy & Security → Screen & System Audio
Recording.
Compounding bug in codec_observer._get_screenshot_ocr:
with ThreadPoolExecutor(max_workers=1) as ex:
future = ex.submit(_screencapture_and_ocr_blocking)
return future.result(timeout=timeout_s)
The `with` exit calls executor.shutdown(wait=True) which BLOCKS until
the thread finishes. When the popup appears, screencapture's subprocess
hangs waiting for the user, and the thread can't finish. Result: my
"100ms timeout" actually waited up to 5 seconds (full subprocess.run
timeout chain on screencapture+osascript), and BOTH the first attempt
AND the Q5.1 retry trigger their own popup. Per poll: ~2 popups +
5s blocking until user dismisses.
Hotfix: add `ocr_enabled` config flag (default True — preserves Step 5
design behavior on permission-granted installs) that, when set false,
skips _get_screenshot_ocr entirely. Buffer still captures active_window
+ clipboard + recent_files; only OCR is bypassed. No screencapture call,
no popup risk.
~/.codec/config.json:
"observer": {
"ocr_enabled": false ← user can flip to disable OCR per-machine
}
3 new tests in tests/test_observer.py:
- test_ocr_enabled_false_skips_screenshot_call (sentinel-based: raises
if _get_screenshot_ocr is called when flag is false)
- test_ocr_enabled_default_is_true (preserves existing behavior)
- test_ocr_enabled_true_calls_screenshot (sanity check)
Result: 33/33 tests passing in 0.21s. Zero side effects.
Permanent fix for OCR users: grant Screen Recording AND Accessibility to
both `node` (PM2 parent) and `python3.13` (codec-observer process) in
System Settings → Privacy & Security. Then the toggle stays on across
restarts.
Deeper subprocess-leak issue (ThreadPoolExecutor.with-exit blocks on
shutdown wait=True) is NOT fixed in this commit — would require restructuring
the OCR call to use Popen with explicit kill on timeout, or
ThreadPoolExecutor.shutdown(wait=False, cancel_futures=True) (Python 3.9+).
Tracked for follow-up. The ocr_enabled flag is the safe immediate fix.
This was referenced May 2, 2026
AVADSA25
pushed a commit
that referenced
this pull request
May 2, 2026
…(Steps 5/6/7) Phase 2 (Observer + Triggers + Shift Report) merged and production-stable: - Step 5 (PR #9 824a52f) + hotfix PR #10 (26e6add) — `observer.ocr_enabled` flag - Step 6 (PR #11 2d2ff3f) — Trigger System (matcher + cooldown + consent) - Step 7 (PR #12 0e40687) — end-of-day shift report Net: +91 passing tests (823/20/73), 0 new failures, 0 new skips. Live audit proof captured: shift_report_started+_completed paired emits at 2026-05-02T18:49:40Z with shared cid=5f188e5485e5.
AVADSA25
pushed a commit
that referenced
this pull request
May 29, 2026
#10) - tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions ...)` appears outside codec_llm. Allowlists the documented vision sites (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated session-script string. Runs in CI via the full pytest suite (no separate workflow step needed). Includes a synthetic-violation test proving the guard is not a no-op. - tests/test_oauth_provider.py: regression tests for the refresh-token scope-escalation defense (requested scopes must be a subset of the original grant) — previously untested. permission_gate mutation cases are already covered by test_agent_runner.py (8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep -> threading.Event refactor is deferred — no specific flaky test identified, and rewriting passing tests without evidence risks regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AVADSA25
added a commit
that referenced
this pull request
May 29, 2026
…al + highs) (#154) * fix(concurrency): bound wall-clock time with run_with_timeout (C4) The idiom `with ThreadPoolExecutor() as ex: fut.result(timeout=T)` defeats its own timeout: the context-manager __exit__ calls shutdown(wait=True), which blocks until the runaway task finishes. A 50ms timeout on a 5s task took ~5s, so codec_mcp tool dispatch and codec_observer OCR could hang on a slow skill or a stuck screencapture popup. Add codec_concurrency.run_with_timeout (daemon thread + join(timeout), no shutdown(wait=True)): surfaces the timeout promptly, abandons the daemon worker (never blocks shutdown), re-raises fn exceptions, and raises TimeoutError (== concurrent.futures.TimeoutError on py3.11+). Migrate both call sites; mirrors the proven codec_hooks._run_hook_with_timeout pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(state): serialize SQLite + JSON read-modify-writes (C5/H14/M6) CodecMemory shared one sqlite3 connection across threads (check_same_thread=False) with no app-level lock — concurrent save/search on the same connection can corrupt cursor state or even segfault (the new test reproduced a hard crash before the fix). Add a threading.RLock and serialize every runtime connection use (save, search, search_recent, get_sessions, cleanup, rebuild_fts, close). RLock so get_context->search and cleanup->close don't self-deadlock; WAL + busy_timeout stay the cross-process layer. Route the unguarded ~/.codec JSON read-modify-writes through the cross-process file_lock: - grant_permission (routes/agents.py): new codec_agent_plan.grants_lock (mirrors _status_lock) held across load_grants -> modify -> save_grants -> set_grants_hash, so concurrent /grant calls can't clobber grants.json. - both notifications.json writers in codec_ask_user (question-write + mark-read) now also hold file_lock(NOTIFICATIONS_PATH), matching the pairing the pending_questions path already used. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ssrf): guard web_fetch + crew fetch against internal/metadata URLs (H1) Attacker-influenceable URLs (chat injection, clipboard, crew task) could be fetched against 127.0.0.1, 10/8, 192.168/16, 169.254.169.254 (cloud metadata), etc. — and the body flows back into the LLM transcript, so a read is an exfil path even with egress otherwise denied. Add codec_ssrf.validate_url: enforces http(s), resolves the host, and rejects if ANY resolved address is loopback / private / link-local / multicast / reserved / unspecified (dual-record + IPv4-mapped-IPv6 aware). Wire it BEFORE the HTTP call in skills/web_fetch.py (which clipboard_url_fetch delegates to) and in codec_agents._web_fetch. DNS-rebinding TOCTOU limitation documented. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(skills): refuse approving a skill that shadows a pinned built-in (H2/H6) The review-and-approve flow wrote an approved skill to the skills dir by basename with no name-collision check. An approved skill named after a hash-pinned built-in (calculator.py, file_write.py, ...) would take that trusted name — shadowing the real built-in, or overwriting it if the write dir is the repo skills dir. skill_approve now refuses any filename matching a name in the committed skills/.manifest.json (case-insensitive, so the guard also holds on case-insensitive filesystems), emitting skill_approve_blocked. PR-1A's load-time hash/AST gate remains the last line of defense; this refuses at the write point so the trusted file is never displaced. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: A-12 inline-LLM-POST guard + OAuth scope-escalation coverage (Fix #10) - tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions ...)` appears outside codec_llm. Allowlists the documented vision sites (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated session-script string. Runs in CI via the full pytest suite (no separate workflow step needed). Includes a synthetic-violation test proving the guard is not a no-op. - tests/test_oauth_provider.py: regression tests for the refresh-token scope-escalation defense (requested scopes must be a subset of the original grant) — previously untested. permission_gate mutation cases are already covered by test_agent_runner.py (8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep -> threading.Event refactor is deferred — no specific flaky test identified, and rewriting passing tests without evidence risks regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: design notes for Fix #8 (routes/chat extraction) + Fix #9 (jsonstore registry) Design-only, per CLAUDE.md §11 + the SECURITY-REMEDIATION-DESIGN gate for the two remaining structural fixes. No production code. Each note grounds scope, seam, migration phasing, test plan, rollback, and open decisions in the actual HEAD surface (codec_dashboard chat cluster ~lines 2207-3005; ~30 ad-hoc json.dump writers + 2 duplicate atomic helpers). Awaiting sign-off before implementation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(state): codec_jsonstore Phase 0 hardening + notifications RMW lock + registry (C8) Fix #9, scoped to the high-value, low-risk core (recommended decisions applied): - Phase 0: atomic_write_json gains `default=` (custom serializer, e.g. str for datetime) and `sort_keys=` passthrough; new read_modify_write(path, fn) standardizes lock+load+mutate+atomic-write so a future RMW can't forget the file_lock. - Phase 2: routes/_shared._save_notification now holds the cross-process file_lock across its load->insert->write (it ran under the in-process _notif_lock only, so separate PM2 daemons clobbered each other). - Phase 5: docs/STATE-FILES.md registry of every ~/.codec state file + lock policy + migration status. Deferred to a documented follow-on (see STATE-FILES.md): Phase 1 mass overwrite-durability swaps; Phase 3 helper convergence (entangles the don't-touch pending_questions/grants write paths — byte-preserving, reviewed per-helper); Phase 4 don't-touch files (config.json, OAuth/Google tokens) with per-file sign-off. The broad json.dump CI ratchet was rejected as low-signal (rationale in STATE-FILES.md). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(dashboard): extract chat_completion phases into helpers (C9 start) Fix #8, intra-file CC reduction. The routes/chat.py module split is deferred: chat_completion shares _load_prompt_overrides + CHAT_SYSTEM_PROMPT with the prompt-override endpoints, so a clean split needs those relocated first (tracked in docs/FIX8-ROUTES-CHAT-EXTRACTION-DESIGN.md). Extract two of the heaviest blocks out of the 356-line chat_completion into named, independently-testable helpers — behavior-preserving (the 48-test chat/stream/budget/escalation baseline is byte-identical before/after): - _build_chat_system_prompt(config, budget, has_attachment, last_user_text): override + step-budget + attachment / content-rewrite / observer-inject logic. - _chat_vision_response(body, messages): the image -> vision-model branch. chat_completion: 356 -> 252 lines. Adds tests/test_chat_helpers.py (5 unit tests now possible because the logic is no longer trapped inside the handler). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ws): authenticate /ws/voice handshake (re-audit N1, Critical) AuthMiddleware subclasses Starlette BaseHTTPMiddleware, which only runs on the `http` scope — so the /ws/voice WebSocket handshake was completely unauthenticated. Combined with the voice pipeline running any trigger-matched skill (terminal/file_ops/pilot, no allowlist), an exposed dashboard (Cloudflare tunnel / dashboard_host=0.0.0.0) was an open, auth-free path to code execution. Add _ws_authorized() mirroring the HTTP AuthMiddleware Layers 0/1/2 (open when nothing configured; else dashboard_token via ?token=/Bearer, OR a valid TOTP-verified biometric session cookie) and reject before accept(). The middleware author already intended /ws to be gated (the 401 branch checks path.startswith("/ws")) — it just never executed on the websocket scope. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ssrf): re-validate every redirect hop in web_fetch + crew fetch (re-audit N3) Fix #7 validated only the INITIAL URL, but both fetch sites follow redirects by default (requests allow_redirects=True; httpx follow_redirects=True), so a public URL returning 302 -> http://169.254.169.254/ (cloud metadata) or http://127.0.0.1/ reached the internal target unchecked and returned its body into the LLM transcript — defeating the H1 guard. Both sites now follow redirects MANUALLY with allow_redirects/follow_redirects False, calling codec_ssrf.validate_url on every hop and capping at 5 redirects. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(skills): resolve symlinked final component in file_write/file_ops (re-audit N4/N10) Both _is_safe_target (file_write) and _is_safe_path (file_ops) realpath-resolved only the PARENT dir then re-joined the raw basename, so a symlinked FINAL component (e.g. ~/Documents/x.md -> ~/.zshrc, or x.json -> ~/.codec/ oauth_state.json) passed the blocklist and open() followed it — a write-through to ~/.zshrc/~/.codec (RCE/secret overwrite) or a read-exfil of secrets over the MCP-exposed surface, defeating D-4. Realpath the FULL path in both safety checks (resolves a symlinked basename to its blocked target), and open()/read()/write() the resolved target so the op can't be redirected between check and open (TOCTOU). realpath handles not-yet-existing files. Matches codec_agents._file_write, which already did it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(auth): enforce TOTP on the ?s= session fallback (re-audit N5) AuthMiddleware's ?s=<token> GET fallback (for mobile img/stream URLs) checked only token existence + age — NOT totp_verified, unlike the cookie path. Since auth_verify/auth_pin hand out a usable session token BEFORE TOTP is completed, a pre-TOTP (or stolen pre-TOTP) token could reach any GET /api endpoint via ?s=, bypassing 2FA. Extract _session_token_valid() (existence + age + TOTP) and route BOTH the cookie path (_verify_biometric_session) and the ?s= fallback through it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: vision error-handling + observer sys.path leak + _research_jobs lock/TTL (re-audit N7/N8/N9) N7: _chat_vision_response runs OUTSIDE chat_completion's try/except, so a non-200/malformed vision-backend response surfaced as a raw 500. Guard the POST + parse and return a graceful 502. N8: codec_observer._ocr_call did sys.path.insert() twice per poll (up to ~5760 entries/day) — and the very next comment says no import happens. Dead + a slow unbounded leak; removed. N9: _research_jobs had no lock and no eviction (unlike _agent_jobs post-H-4) — unbounded memory growth + a dict-changed-size race vs the worker. Add _evict_stale_research_jobs (+ _research_jobs_lock) mirroring the agent-jobs pattern; deep_research_start evicts + guards the add/update under the lock. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: builtins AST gap + fail-closed skill-approve + timing-safe PIN (re-audit N19/N20/N21) N19: add `builtins` to _SKILL_DANGEROUS_MODULES — `import builtins; builtins.exec(src)` bypassed the gate (builtins.exec is an Attribute call, not the bare-name Call the dangerous-calls check catches). N20: _pinned_builtin_names() returned an empty set on a manifest read failure, so skill_approve's pinned-name guard blocked nothing during a transient failure. Return None on failure and FAIL CLOSED (refuse the approve). N21: routes/auth.py compared the PIN hash with `==` (timing side-channel) — use hmac.compare_digest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * build: pin CVE-safe dependency floors + fastmcp import-compat guard (Fix #6) Live-env investigation (read-only + staged, rolled back) found: the deployed python3.13 runs fastmcp 3.1.1 while requirements declares >=3.3.1, and an in-place upgrade to 3.3.1 left a broken install state on that interpreter. In a CLEAN env the code imports fine on 3.3.1 — so the fix is a clean reinstall, NOT a code port. The fastmcp SSRF (CVE-2026-32871) is in OpenAPIProvider, which CODEC does not use, so it isn't exploitable here (hygiene, not an emergency). - requirements.txt: add advisory-safe floors for the transitive/runtime deps that were unpinned — urllib3>=2.7.0, pillow>=12.2.0, cryptography>=46.0.7. (fastmcp>=3.3.1 and requests>=2.34.2 were already correct; the deployed interpreters were just stale vs the declared floors.) - tests/test_fastmcp_compat.py: assert fastmcp meets the 3.3.1 floor and that codec_mcp + codec_oauth_provider import against it — runs in CI and would have caught both the version drift and any future fastmcp API move. NO live mutation: the deployed venvs are untouched (rolled back to their original state). Refresh them at a maintenance window — see PR runbook. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: skill_forge off chat allowlist + atomic config writes (re-audit mediums) CHAIN-002: skill_forge writes forged code to disk WITHOUT the review gate, yet was on CHAT_SKILL_ALLOWLIST — a prompt-injected [SKILL:skill_forge:...] tag could persist an unsandboxed skill. Removed it (+ the stale ask_codec_to_build entry that had no backing skill); create_skill stays as the review-gated path. Atomic config writes (truncated-read → empty-fallback race): route save_triggers (custom_triggers.json), custom-agent save, and update_schedule (schedules.json) through codec_jsonstore.atomic_write_json instead of truncate-then-write, so a concurrent reader can't catch a half-written file. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: design note for the chat/MCP strict-consent gate (re-audit medium) The Step-3 strict-consent gate lives only in codec_agent_runner; the chat [SKILL:] + MCP tool paths reach high-power skills with only is_dangerous / path-blocklists (red-team CHAIN-001/002/006). This >1-module, UX-changing fix is design-gated per CLAUDE.md §11 — note covers the two chokepoints (codec_dispatch.run_skill, codec_mcp.tool_fn), the per-transport consent UX (MCP hard-refuse / chat prompt / voice announce), destructive classification, test/rollback, and the open decisions. No code until approved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(consent): MCP hard-refuse for destructive skills + classifier (B1/C) — and regen skill manifest Consent gate, increment 1 (Decisions B1 + C): - codec_consent: is_destructive_skill (per-skill SKILL_DESTRUCTIVE flag via the registry + _HTTP_BLOCKED backstop + known high-power built-ins) + gate_enabled kill switch (CONSENT_GATE_ENABLED). - codec_skill_registry: AST-extract SKILL_DESTRUCTIVE + get_destructive() accessor (parallel to SKILL_MCP_EXPOSE) — the extensible per-skill path. - codec_mcp.tool_fn: hard-refuse destructive skills over MCP (claude.ai can't consent at the operator tier) — closes the red-team's file_ops/imessage_send/ pilot-over-MCP chains; emits a `denied` audit. Also REGENERATES skills/.manifest.json: the earlier Fix #7 (web_fetch) and N4/N10 (file_write, file_ops) edits changed those built-ins' bytes, so their hash-pins were stale — the load-time gate would have AST-refused file_write/ file_ops (they import os) in production, and CI --check would fail. Manifest now matches the 76 built-ins. Chat-side consent (A1: consent_required + re-dispatch) is the next increment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(consent): chat-path consent gate via the AskUserQuestion panel (A2) Consent gate increment 2 (Decision A2 — reuse the proven Phase 1 Step 3 PWA consent panel, no new frontend). Both chat skill-runners now gate destructive skills before firing: - _try_skill (pre-LLM hijack) and _try_skill_by_name (post-LLM [SKILL:] tag — the prompt-injection vector) call codec_consent.chat_consent_ok, which for a destructive skill routes through codec_ask_user.ask(destructive=True, asked_from="chat") — literal verb-match consent rendered by the existing PWA panel; declined/timeout/ask_user-unavailable → skill skipped (fail-closed). - Non-destructive skills are unaffected (no prompt). Kill switch: CONSENT_GATE_ENABLED=false. With B1 (MCP hard-refuse) this closes the red-team's cross-cutting root cause: the chat [SKILL:] + MCP paths reached high-power skills with only is_dangerous. is_dangerous is retained as the layered terminal command heuristic (Decision D). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Mickael Farina <farina.mickael@gmail.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
User report (2026-05-02 morning): after merging Step 5 and starting
codec-observer, the macOS Screen Recording permission popup fired "over a hundred times" — the user clicked Allow but it kept coming back.Root cause
Two-part bug in
_get_screenshot_ocr:macOS Allow is one-shot. Persistent grant requires the System Settings toggle, not the popup's Allow button. Clicking Allow on the popup only authorizes that single call.
ThreadPoolExecutor.__exit__blocks on shutdown. When the popup appears, screencapture's subprocess hangs waiting for the user. My code:__exit__callsshutdown(wait=True)— blocks until the thread finishes. The thread can't finish until the popup is dismissed. The "100ms timeout" actually waited ~5s (subprocess.run's own 2s+3s chain). And the Q5.1 retry triggered a SECOND popup. Per poll: ~2 popups + 5s blocking.Fix
Add
ocr_enabledconfig flag (defaultTrue— preserves Step 5 behavior on permission-granted installs). Whenfalse,poll()skips_get_screenshot_ocrentirely. Buffer still capturesactive_window+ clipboard + recent_files; only OCR is bypassed.Test plan
tests/test_observer.py:test_ocr_enabled_false_skips_screenshot_call— sentinel-based, raises if_get_screenshot_ocris called despite the flagtest_ocr_enabled_default_is_true— preserves existing behaviortest_ocr_enabled_true_calls_screenshot— sanity check for the default pathpytest tests/test_observer.py→ 33/33 passing in 0.21s. Zero side effects (Apple Reminders=0, /tmp/codec_*.txt=0).Deployment
This PR ships the code change. The user's runtime config has ALREADY been edited to
ocr_enabled: false(atomic write, backup at~/.codec/config.json.bak-1777736987). After merge:Observer resumes polling without OCR. No popups possible.
Permanent fix (out of scope for this hotfix)
Two follow-ups for a future PR:
nodeANDpython3.13via System Settings. Persistent grant survives restarts.ThreadPoolExecutor + future.result(timeout=...)withsubprocess.Popen+ explicitproc.kill()on timeout. The current shutdown-wait pattern is the underlying root cause; theocr_enabledflag is just the immediate off-switch.🤖 Generated with Claude Code